Sandglass: Twin Paraphrasing Spoken Language Translation
Abstract
This paper proposes a new machine translation design that serves as the core architecture of an ongoing project named Sandglass. The Sandglass system places special emphasis on monolingual processing and is designed to deal effectively with spoken language. The system has good portability, which comes from the modularity provided by a natural language protocol and from reinforced monolingual processing. This paper clarifies some advantages of the system by discussing several aspects of conventional translation approaches. Currently, Sandglass is being applied to bidirectional Chinese-Japanese spoken language translation of travel conversation dialogs.
Related resources
Paraphrasing Spoken Japanese for Untangling Bilingual Transfer
One of the problems in spoken language translation is the enormous variety of expressions not found in text translation. This volume can lead to a sparse translation coverage. In order to tackle this problem, we take the practical approach of untangling slight variations in the source language before transferring a source expression to its target. We therefore discuss how effective paraphrasing ...
Interaction between Paraphraser and Transfer for Spoken Language Translation
One of the problems in spoken language translation is the enormous variety of expressions not found in text translation. This volume can lead to a sparse translation coverage. In order to tackle this problem, we propose a machine translation model where an input is translated through both source-language and target-language paraphrasing processes. In this paper, we discuss the source paraphrasi...
Paraphrasing of Chinese Utterances
One of the key issues in spoken language translation is how to deal with unrestricted expressions in spontaneous utterances. This research is centered on the development of a Chinese paraphraser that automatically paraphrases utterances prior to transfer in Chinese-Japanese spoken language translation. In this paper, a pattern-based approach to paraphrasing is proposed for which only morphologi...
Building a Paraphrase Corpus for Speech Translation
When a machine translation (MT) system receives input sentences of spoken language, the following two types of sentences are difficult to translate: (1) long sentences and (2) sentences having redundant expressions often seen in spoken language. To reduce these difficulties, we are developing methods to paraphrase input sentences into more translatable ones. In this paper, we report a prelimina...
EBMT, SMT, hybrid and more: ATR spoken language translation system
This paper introduces ATR’s project named Corpus-Centered Computation (C3), which aims at developing a translation technology suitable for spoken language translation. C3 places corpora at the center of its technology. Translation knowledge is extracted from corpora, translation quality is gauged by referring to corpora, the best translation among multiple-engine outputs is selected based on co...